Read Text From File 



Clean - Separate HTML Tags From 
Other Text 



Chop - Convert text into a word list 



Score - Recognize the positions of attribute 
labels and values in the word list 






Focus - Identify a region of interest, which 
contains the data to be extracted 



Segment - Separate the region of interest into 
record regions that each contain text 
corresponding to a single record 






Extract - Determine attribute values for records 
based on the text in record regions 



Write - Write the set of records to an XML file 
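
The steps above can be sketched in Python as a chain of small functions, one per box. This is only an illustrative sketch under stated assumptions, not the method the figure actually describes: the function names, the regular expression used to detect tags, the rule that a record starts at each occurrence of the first label found, and the sample labels "Name:" and "Price:" are all hypothetical. The figure itself does not say how scoring, focusing, or segmenting are performed.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: an HTML tag is anything between "<" and the next ">".
TAG_RE = re.compile(r"(<[^>]+>)")


def is_tag(word):
    return TAG_RE.fullmatch(word) is not None


def read_text(path):
    """Read: load the whole file as one string."""
    with open(path, encoding="utf-8") as f:
        return f.read()


def clean(text):
    """Clean: separate HTML tags from other text, dropping empty pieces."""
    return [piece for piece in TAG_RE.split(text) if piece.strip()]


def chop(pieces):
    """Chop: build a word list; each tag stays a single entry."""
    words = []
    for piece in pieces:
        if is_tag(piece):
            words.append(piece)
        else:
            words.extend(piece.split())
    return words


def score(words, labels):
    """Score: positions of attribute labels (hypothetical exact-match rule)."""
    return [(i, w) for i, w in enumerate(words) if w in labels]


def focus(words, hits):
    """Focus: region from the first label to the end of the last value."""
    if not hits:
        return (0, 0)
    start, end = hits[0][0], hits[-1][0] + 1
    while end < len(words) and not is_tag(words[end]):
        end += 1
    return (start, end)


def segment(hits, region):
    """Segment: assume each record starts where the first label recurs."""
    if not hits:
        return []
    first_label = hits[0][1]
    starts = [i for i, label in hits if label == first_label]
    return list(zip(starts, starts[1:] + [region[1]]))


def extract(words, record_regions, labels):
    """Extract: a value is the run of words after a label, up to a tag or label."""
    records = []
    for start, end in record_regions:
        record, current = {}, None
        for word in words[start:end]:
            if word in labels:
                current = word.rstrip(":")
                record[current] = []
            elif is_tag(word):
                current = None
            elif current is not None:
                record[current].append(word)
        records.append({k: " ".join(v) for k, v in record.items()})
    return records


def write_xml(records, path):
    """Write: one <record> element per record, one child per attribute."""
    root = ET.Element("records")
    for record in records:
        node = ET.SubElement(root, "record")
        for name, value in record.items():
            ET.SubElement(node, name).text = value
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def run(text, labels):
    """Chain Clean through Extract on text that has already been read."""
    words = chop(clean(text))
    hits = score(words, labels)
    return extract(words, segment(hits, focus(words, hits)), labels)
```

A call such as run(read_text("page.html"), {"Name:", "Price:"}) followed by write_xml(records, "out.xml") would exercise all seven steps; both file names are placeholders. Keeping each step as a separate function mirrors the box-per-stage layout of the figure, so any single stage can be replaced without touching the others.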



